Self Determining Speaker Recognition by Three Level Segmental Processing Of Linear Prediction Residual
Abstract
This paper proposes a speaker recognition system that exploits speaker-specific source information (the LP residual) at three levels: subsegmental, segmental and suprasegmental. The subsegmental analysis processes the LP residual in blocks of 5 ms with a shift of 2.5 ms to extract speaker information. The segmental analysis processes it in blocks of 20 ms with a shift of 2.5 ms. The suprasegmental analysis processes it in blocks of 250 ms with a shift of 6.25 ms. Speaker recognition studies on the TIMIT (Texas Instruments and Massachusetts Institute of Technology) database show that segmental analysis performs best, followed by subsegmental analysis, while suprasegmental analysis performs worst. However, the evidence from the three levels appears to be different and combines well to improve performance, indicating that each level of processing captures different speaker information. Finally, combining the evidence from all three levels with vocal tract information further improves speaker recognition performance.
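The three block sizes and shifts in the abstract can be illustrated with a short sketch. It is not the authors' implementation: the function names, the LP order of 10, the 16 kHz sample rate, and the use of a single predictor for the whole signal (rather than frame-wise LP analysis) are assumptions made only for illustration.

```python
import numpy as np

# Block size and shift in milliseconds for each level, as stated in the abstract.
LEVELS_MS = {
    "subsegmental": (5.0, 2.5),
    "segmental": (20.0, 2.5),
    "suprasegmental": (250.0, 6.25),
}


def lp_residual(signal, order=10):
    """LP residual via the autocorrelation method (order is an assumed value).

    Simplification: one predictor is estimated for the whole signal.
    """
    x = np.asarray(signal, dtype=float)
    # Autocorrelation values r[0..order].
    r = np.array([np.dot(x[: len(x) - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], with R[i, j] = r[|i - j|].
    idx = np.abs(np.arange(order)[:, None] - np.arange(order)[None, :])
    R = r[idx] + 1e-9 * r[0] * np.eye(order)  # small ridge for stability
    a = np.linalg.solve(R, r[1:])
    # Predict x[n] from the previous `order` samples, then subtract.
    predicted = np.zeros_like(x)
    for k in range(order):
        predicted[k + 1:] += a[k] * x[: len(x) - k - 1]
    return x - predicted


def frame_signal(x, sample_rate, block_ms, shift_ms):
    """Cut x into overlapping blocks; returns an array (n_frames, block_len)."""
    x = np.asarray(x, dtype=float)
    block = int(round(block_ms * sample_rate / 1000.0))
    shift = int(round(shift_ms * sample_rate / 1000.0))
    if len(x) < block:
        return np.empty((0, block))
    n_frames = 1 + (len(x) - block) // shift
    starts = shift * np.arange(n_frames)
    return x[starts[:, None] + np.arange(block)[None, :]]


def three_level_frames(signal, sample_rate=16000, order=10):
    """Frame the LP residual at the three levels named in the abstract."""
    residual = lp_residual(signal, order)
    return {
        name: frame_signal(residual, sample_rate, block_ms, shift_ms)
        for name, (block_ms, shift_ms) in LEVELS_MS.items()
    }
```

At the assumed 16 kHz rate the blocks are 80, 320 and 4000 samples long with shifts of 40, 40 and 100 samples. The abstract does not say how each block is turned into features or modelled, so the sketch stops at framing.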
Similar resources
Speaker Information using Subsegmental and Segmental Analysis of LP Residual
Linear Prediction (LP) residual mostly contains the excitation source information. This work analyzes the LP residual once using frame size of 5 ms (subsegmental) and another time using frame size of 20 ms (segmental), each with a shift of 2.5 ms. The residual frames are then subjected to nonparametric Vector Quantization (VQ) to store the unique excitation sequences for each speaker. The testi...
On the Usefulness of Linear and Nonlinear Prediction Residual Signals for Speaker Recognition
This paper compares the identification rates of a speaker recognition system using several parameterizations, with special emphasis on the residual signal obtained from linear and nonlinear predictive analysis. It is found that the residual signal is still useful even when using a high dimensional linear predictive analysis. On the other hand, it is shown that the residual signal of a nonlinear...
Subsegmental, Segmental and Suprasegmental Features for Speaker Recognition Using Gaussian Mixture Model
In the feature extraction stage, features representing speaker information are extracted from the speech signal. In the present study, the LP residual derived from the speech data is used for training and testing, and it is processed in the time domain at the subsegmental, segmental and suprasegmental levels. In the training phase, GMMs are built, one for each speaker, using the training data ...
Speaker recognition using residual signal of linear and nonlinear prediction models
This paper discusses the usefulness of the residual signal for speaker recognition. It is shown that combining a measure defined over the LPCC coefficients with a measure defined over the energy of the residual signal improves on the classical method, which considers only the LPCC coefficients. If the residual signal is obtained from a linear prediction analysis,...
Time-frequency analysis of vocal source signal for speaker recognition
This paper investigates the importance of spectrotemporal characteristics of the source excitation signal for speaker recognition. We propose an effective feature extraction technique for obtaining essential time-frequency information from the linear prediction (LP) residual signal, which is closely related to the glottal excitation of the individual speaker. With pitch synchronous analysis, wavele...
Publication date: 2012